
    Asymptotic behaviour of the posterior distribution in overfitted mixture models.

    In this paper we study the asymptotic behaviour of the posterior distribution in a mixture model when the number of components in the mixture is larger than the true number of components, a situation commonly referred to as an overfitted mixture. We prove in particular that, quite generally, the posterior distribution behaves in a stable and interesting way, since it tends to empty the extra components. This stability is achieved under some restrictions on the prior, which can be used as a guideline for choosing the prior. Some simulations are presented to illustrate this behaviour.
    Keywords: posterior concentration; mixture models; overfitting; asymptotic; Bayesian

    Bayesian nonparametric dependent model for partially replicated data: the influence of fuel spills on species diversity

    We introduce a dependent Bayesian nonparametric model for the probabilistic modeling of membership of subgroups in a community based on partially replicated data. The focus here is on species-by-site data, i.e. community data where observations at different sites are classified into distinct species. Our aim is to study the impact of additional covariates, for instance environmental variables, on the data structure, and in particular on the community diversity. To that end, we introduce dependence a priori across the covariates, and show that it improves posterior inference. We use a dependent version of the Griffiths-Engen-McCloskey distribution defined via the stick-breaking construction. This distribution is obtained by transforming a Gaussian process whose covariance function controls the desired dependence. The resulting posterior distribution is sampled by Markov chain Monte Carlo. We illustrate the application of our model to a soil microbial dataset acquired across a hydrocarbon contamination gradient at the site of a fuel spill in Antarctica. This method allows for inference on a number of quantities of interest in ecotoxicology, such as diversity or effective concentrations, and is broadly applicable to the general problem of community response to environmental variables.
    Comment: Main Paper: 22 pages, 6 figures. Supplementary Material: 11 pages, 1 figur
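
    The abstract above refers to the stick-breaking construction of the Griffiths-Engen-McCloskey (GEM) distribution. As a rough illustration only, the sketch below draws truncated weights from the standard, covariate-free GEM(theta) distribution. It is not the authors' dependent model: the Gaussian process transformation is not shown, and the function name, the truncation level and the value of theta are assumptions made for this example.

```python
import random


def stick_breaking_weights(theta, num_sticks, rng):
    """Draw truncated GEM(theta) weights by stick-breaking.

    Each step breaks off a Beta(1, theta) fraction of the stick that
    is still left. Returns the weights and the unbroken remainder.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, theta)  # V_k ~ Beta(1, theta)
        weights.append(remaining * fraction)    # w_k = V_k * prod_{j<k} (1 - V_j)
        remaining *= 1.0 - fraction
    return weights, remaining


# Hypothetical settings, chosen only for illustration.
rng = random.Random(0)
weights, leftover = stick_breaking_weights(theta=2.0, num_sticks=20, rng=rng)
```

    The weights and the leftover mass always sum to one, so the truncation level only decides how much mass is left unassigned.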

    Model choice versus model criticism

    The new perspectives on ABC and Bayesian model criticisms presented in Ratmann et al. (2009) are challenging standard approaches to Bayesian model choice. We discuss here some issues arising from the authors' approach, including prior influence, model assessment and criticism, and the meaning of error in ABC.
    Comment: This is a comment on the recent paper by Ratmann, Andrieu, Wiuf, and Richardson (PNAS, 106), submitted too late for PNAS to consider i

    Computationally Efficient Simulation of Queues: The R Package queuecomputer

    Large networks of queueing systems model important real-world systems such as MapReduce clusters, web servers, hospitals, call centers and airport passenger terminals. To model such systems accurately, we must infer queueing parameters from data. Unfortunately, for many queueing networks there is no clear way to proceed with parameter inference from data. Approximate Bayesian computation could offer a straightforward way to infer parameters for such networks if we could simulate data quickly enough. We present a computationally efficient method for simulating from a very general set of queueing networks with the R package queuecomputer. Remarkable speedups of more than 2 orders of magnitude are observed relative to the popular DES packages simmer and simpy. We replicate output from these packages to validate the package. The package is modular and integrates well with the popular R package dplyr. Complex queueing networks with tandem, parallel and fork/join topologies can easily be built with these two packages together. We show how to use this package with two examples: a call center and an airport terminal.
    Comment: Updated for queuecomputer_0.8.
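
    To give a sense of why queue output can be computed quickly once arrival and service times are known, here is a minimal sketch for a single-server, first-come-first-served queue. It is an illustration in Python, not the queuecomputer API or its algorithm. The function name is hypothetical, and the code assumes the arrival times are already sorted.

```python
def fcfs_departures(arrivals, services):
    """Departure times for a single-server FCFS queue.

    Assumes `arrivals` is sorted in increasing order and `services`
    holds the matching service durations.
    """
    departures = []
    server_free_at = 0.0
    for arrival, service in zip(arrivals, services):
        # A customer starts service on arrival, or once the server frees up.
        start = max(arrival, server_free_at)
        server_free_at = start + service
        departures.append(server_free_at)
    return departures


# Three customers, each needing 2 time units of service.
example = fcfs_departures([0.0, 1.0, 2.0], [2.0, 2.0, 2.0])
```

    Each departure depends only on the previous departure and the current arrival, so one pass over the data is enough and no event list is needed.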

    Phase randomisation: a convergence diagnostic test for MCMC

    Most MCMC users address the convergence problem by applying diagnostic tools to the output produced by running their samplers. Potentially useful diagnostics may be borrowed from diverse areas such as time series. One such method is phase randomisation. The aim of this paper is to describe this method in the context of MCMC, summarise its characteristics, and contrast its performance with those of the more common diagnostic tests for MCMC. It is observed that the new tool contributes information about third and higher order cumulant behaviour which is important in characterising certain forms of nonlinearity and nonstationarity.
    Keywords: convergence diagnostics; higher cumulants; Markov chain Monte Carlo; non-linear time series; stationarity; surrogate series
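
    For readers unfamiliar with phase randomisation, the sketch below builds one surrogate series in the usual time-series sense: take the discrete Fourier transform, keep every amplitude, replace the phases with uniform random ones, and transform back. It does not reproduce the paper's diagnostic test. The function names and the toy autoregressive chain are assumptions for illustration, and the naive O(n^2) transform is used only to keep the example self-contained.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft_real(spectrum):
    """Inverse DFT, keeping the real part (spectrum is conjugate-symmetric)."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def phase_randomised_surrogate(series, rng):
    """Surrogate with the same amplitude spectrum and random phases."""
    n = len(series)
    spectrum = dft(series)
    randomised = list(spectrum)
    # Leave k = 0 (the mean) and, for even n, the Nyquist term untouched.
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        randomised[k] = abs(spectrum[k]) * cmath.exp(1j * phase)
        # Mirror with the conjugate so the inverse transform is real.
        randomised[n - k] = randomised[k].conjugate()
    return inverse_dft_real(randomised)


# Toy stand-in for sampler output: a short AR(1)-style chain.
rng = random.Random(1)
chain = [0.0]
for _ in range(31):
    chain.append(0.8 * chain[-1] + rng.gauss(0.0, 1.0))
surrogate = phase_randomised_surrogate(chain, rng)
```

    The surrogate shares the chain's mean and power spectrum, hence its second-order structure, so a statistic that differs markedly between the chain and many such surrogates points to higher-order or nonstationary behaviour.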